# Data and Experimental Code

This repository contains the dataset and experimental code for a thematic classification study. The dataset is split into a training set and a test set, used respectively to train and evaluate several machine learning models.

## Dataset

The data is divided into two parts:
- **Training Set**: Stored in `train.xlsx`
- **Test Set**: Stored in `test.xlsx`

Each sample is labeled with a number from 0 to 6, corresponding to the following seven topic categories:
- `0`: Love and Marriage
- `1`: Friendship and Farewell
- `2`: Journey and Homesickness
- `3`: Frontier and War
- `4`: Landscape and Countryside
- `5`: History and Nostalgia
- `6`: Expressing Emotions Through Concrete Objects
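
The label scheme above can be sketched in code as a simple lookup table. This is a minimal sketch, not the repository's own loader: the helper names `label_name` and `load_split` are hypothetical, and `load_split` assumes that `pandas` and an Excel engine such as `openpyxl` are installed. The column layout of the spreadsheets is not documented here, so the sketch makes no assumption about it.

```python
# Label ids and category names, mirroring the list above.
LABEL_NAMES = {
    0: "Love and Marriage",
    1: "Friendship and Farewell",
    2: "Journey and Homesickness",
    3: "Frontier and War",
    4: "Landscape and Countryside",
    5: "History and Nostalgia",
    6: "Expressing Emotions Through Concrete Objects",
}


def label_name(label_id):
    """Return the category name for a numeric label (0 to 6)."""
    return LABEL_NAMES[int(label_id)]


def load_split(path):
    """Read one split into a DataFrame (hypothetical helper).

    Assumes pandas and an Excel engine (e.g. openpyxl) are available.
    """
    import pandas as pd  # imported lazily so the mapping works without pandas

    return pd.read_excel(path)
```

With those assumptions, `load_split("train.xlsx")` would return the training set as a DataFrame, and `label_name(3)` would return `"Frontier and War"`.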

## Experimental Code

The main entry file for the experiments is `main.py`, which includes the following functions:

- `run_base_learners()`: For training base learners
- `run_ensemble_ml_learners()`: For training traditional ensemble learners
- `run_ensemble_MLP_learners()`: For training ensemble learners based on Multi-Layer Perceptrons (MLP)

All learners are implemented in `Learners.py`.
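
One way to choose which of the three entry functions to run is a small command-line dispatcher. The sketch below is an illustration only and is not taken from `main.py`: the helper names `build_parser` and `run_stage` and the stage keys are hypothetical, and it assumes each entry function can be called without arguments.

```python
import argparse


def build_parser(stage_names):
    """Build a parser that accepts exactly one of the given stage names."""
    parser = argparse.ArgumentParser(description="Run one experiment stage.")
    parser.add_argument("stage", choices=sorted(stage_names))
    return parser


def run_stage(stages, argv=None):
    """Parse argv (defaults to sys.argv[1:]) and call the selected stage.

    `stages` maps a stage name to a zero-argument callable.
    """
    args = build_parser(stages).parse_args(argv)
    return stages[args.stage]()


# Hypothetical wiring inside main.py, assuming zero-argument functions:
# stages = {
#     "base": run_base_learners,
#     "ensemble-ml": run_ensemble_ml_learners,
#     "ensemble-mlp": run_ensemble_MLP_learners,
# }
# run_stage(stages)
```

Keeping the mapping in a dictionary means that adding a new experiment only requires one new entry.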

### Technology Stack
- **Deep Learning Framework**: PyTorch
- **Machine Learning Framework**: Scikit-learn
- **Pre-trained Models**: Provided by the `transformers` library
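
Before running the experiments, it can help to confirm that the stack listed above is importable. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical. Note that Scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Import names for PyTorch, Scikit-learn, and transformers.
REQUIRED_MODULES = ("torch", "sklearn", "transformers")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules()` returns an empty list when all three libraries are installed, and otherwise lists the ones that still need installing.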

## Installation

To run the experiments, install the required Python packages:

```bash
pip install torch scikit-learn transformers
```